4.  INTRODUCTION

The  purpose  of  this  project  is  to  develop  a  database  of  digital  mammograms  that  can  be 
used  by  researchers  who  (1)  are  trying  to  determine  the  image  quality  requirements  of  detectors 
for  digital  mammography;  (2)  are  developing  image  processing  techniques  to  optimize  the 
displayed  digital  mammogram;  (3)  are  developing  computerized  methods  for  analyzing 
mammograms;  (4)  are  studying  the  effects  of  image  compression  methods  on  image  quality;  (5) 
are  developing  methods  for  remote  transmission  of  mammograms;  and  (6)  are  studying  the 
relationship  between  image  quality  and  diagnostic  accuracy.  This  database  also  could  be  used  as 
a  resource  for  teaching  radiology  residents  and  for  testing  the  performance  levels  of 
mammographers. 

The  specific  aims  of  this  proposal  are: 

1. Collect and digitize 200 cases in each of 5 categories of mammograms, those exhibiting: (i)
clustered microcalcifications, (ii) masses, (iii) architectural distortions, (iv) asymmetric densities,
and (v) no lesions (i.e., normals).

2. Make these cases available to other researchers either over a computer network (the Internet) or by
sending images on computer tape or CD. The database will be distributed as widely as possible
so that comparisons of different computerized analysis techniques can be standardized.

5.  BODY 


This research is being funded as an infrastructure award and, as such, it does not represent
a research project per se. That is, there is no hypothesis that we are trying to prove. Therefore,
this report is structured slightly differently from a normal scientific research report: it is heavy on
method and light on actual results. In this project, the procedure is the most important
component, and it is applied continuously in a straightforward manner to achieve the goal of
creating the database of mammograms.


Task 1: Collect and digitize mammograms. (See Figure 1.)

We now have 669 cases digitized (see Table I, at the end of the report). We have reached
or surpassed our target of 200 cases of calcifications, masses and normal cases. We will not
reach our goal of 200 cases of asymmetries and architectural distortions because these types of
lesion are less common than masses and calcifications.

The  stumbling  block  to  this  project  remains  marking  the  exact  location  of  the  lesions. 

The currently available database from the University of South Florida, while large, does not have the
truth marked exactly. This makes it difficult to use and to compare results between different
researchers. Each research group marks its own truth, and it will differ from any other
group's truth. Therefore, while groups are using the same cases, they are not using the same truth.
This makes intercomparison of techniques problematic. We will continue to encourage the
radiologists in our department to help us mark truth.

It is important that this database be used in such a way that comparisons between
different algorithms can be made. This is the main motivation for creating such a database. To
allow valid comparisons to be made, two things are needed: (i) exactly the same cases need to
be used to measure performance, and (ii) the same criteria for scoring the results need to be used.
To this end, we have divided the database into testing and training cases. Furthermore, we are
preparing recommendations for scoring the results based on some preliminary results in our lab
[1]. This information will be released with the database. No other database offers such
instructions for its use, and thus comparisons of different techniques are still difficult to do. We
are in the process of specifying an objective scoring method based on radiologists' input. Once
we have the scoring method developed and truth marked, we will begin to distribute the
database widely. The study is underway. Preliminary results are inconclusive at this stage.
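The report does not say how cases were assigned to the training and testing sets. As an illustration only, the following minimal sketch shows one way a fixed partition could be made reproducible, so that every research group derives the same split from the same case identifiers. The function names, the string case identifiers, and the 50% test fraction are all assumptions made for this example, not part of the project.

```python
import hashlib


def assign_split(case_id: str, test_fraction: float = 0.5) -> str:
    """Assign a case to 'train' or 'test' from a hash of its identifier.

    Hypothetical scheme: the assignment depends only on the case ID, so it
    does not change when cases are added or when the list is reordered.
    """
    digest = hashlib.sha256(case_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if value < test_fraction else "train"


def split_cases(case_ids, test_fraction=0.5):
    """Partition case IDs into sorted 'train' and 'test' lists."""
    split = {"train": [], "test": []}
    for case_id in sorted(case_ids):
        split[assign_split(case_id, test_fraction)].append(case_id)
    return split
```

Because the assignment is a pure function of the identifier, two groups calling `split_cases` on the same identifiers obtain identical training and testing sets, which is condition (i) above. Condition (ii), a shared scoring criterion, still has to be specified separately.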


Figure 1. A flowchart of the steps required to collect, digitize, archive, and distribute
the mammographic database. The 'Full Image' is the whole digitized mammogram at full
resolution. The 'Reduced Image' is a minified version (reduced resolution) of the full
image. The 'ROI Image' is a portion of the full image at full resolution.
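The relationship between the three image versions named in the caption can be sketched as follows. This is a minimal illustration, assuming an image stored as a list of rows of integer pixel values; the report does not specify the minification method, so the block averaging used here, and the function names, are assumptions.

```python
def reduce_image(full, factor):
    """Minify an image by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are discarded.
    """
    rows = len(full) // factor
    cols = len(full[0]) // factor
    reduced = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                full[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            # Integer mean of the block keeps pixel values as integers.
            out_row.append(sum(block) // len(block))
        reduced.append(out_row)
    return reduced


def extract_roi(full, top, left, height, width):
    """Return a region of interest cut from the full image at full resolution."""
    return [row[left:left + width] for row in full[top:top + height]]
```

For a 4 x 4 test image, `reduce_image(full, 2)` yields a 2 x 2 'Reduced Image', while `extract_roi(full, 1, 1, 2, 2)` yields a 2 x 2 'ROI Image' whose pixels are copied unchanged from the 'Full Image'.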

Task 2: Establish protocol for transmitting database

We  originally  considered  the  ACR/NEMA  (DICOM)  image  format  for  our  database. 
However,  when  we  began  our  work,  the  ACR/NEMA  format  did  not  have  a  module  for 
mammography,  and  it  would  have  been  an  extensive  project  to  develop  one  at  that  time. 
Recently  a  digital  mammography  module  has  been  approved.  We  have  at  this  time  decided  not 
to  use  the  DICOM  format,  since  many  users  of  our  database  may  not  have  a  DICOM  reader. 


Task  3:  Maintain  database  and  distribute  cases 


Maintenance and distribution of the database are currently minimal.
These tasks will become important shortly as cases go "on-line". Cases are being archived on
4-mm tape and DVD.


6.  KEY  RESEARCH  ACCOMPLISHMENTS 


•  Collection  of  669  mammographic  cases 

7.  REPORTABLE  OUTCOMES 

Presentations  and  Manuscripts: 

1. Nishikawa RM, Wolverton DE, Schmidt RE, Johnson RE, Pisano ED, Hemminger BM: A
common database of mammograms for research in digital mammography. U.S. Army
Medical Research and Materiel Command Breast Cancer Research Program: An Era of Hope,
November, 1997, Washington, DC.

2. Nishikawa RM, Wolverton DE, Schmidt RA, Pisano ED, Hemminger BM, Moody J: A
common database of mammograms for research in digital mammography. In: Doi K, Giger
ML, Nishikawa RM, and Schmidt RA (eds.). Digital Mammography '96. (Amsterdam:
Elsevier Science) 435-438, 1996.

3. Nishikawa RM: Mammographic databases. Breast Disease 10: 137-150, 1998.


8.  CONCLUSIONS 


The development of a common database of mammograms for digital mammography
research is underway. We are currently establishing an objective scoring method based on
radiologists' input. We will distribute cases and the scoring method in order to ensure that
meaningful comparisons between different techniques can be made. Such comparisons are
currently not possible or are problematic with any existing database.

Table I. Breakdown of cases in the database as of October 1/02.

Type of Lesion             Pathology    # of Cases
Mass                       Malignant    121
Mass                       Benign        75
Microcalcifications        Malignant    120
Microcalcifications        Benign        87
Asymmetric Density         Malignant     30
Asymmetric Density         Benign         4
Architectural Distortion   Malignant     32
Architectural Distortion   Benign         3
Normal                                  200
Total                                   669

